Rank | Count | Beginning |
---|---|---|
4792 | 29183 | A |
36555 | 9680 | Az |
58772 | 1606 | Ez |
51853 | 901 | Egy |
95396 | 782 | Története |
61548 | 693 | Ezt |
56906 | 593 | Ennek |
59813 | 517 | Ezek |
82389 | 428 | Mivel |
54306 | 422 | Élete |
66582 | 421 | Ha |
73302 | 412 | Később |
53620 | 388 | Ekkor |
63390 | 356 | Fekvése |
84355 | 346 | Nem |
70739 | 342 | Itt |
62289 | 336 | Ezután |
69523 | 320 | Így |
23665 | 318 | Amikor |
55960 | 315 | Első |
60636 | 312 | Ezen |
51072 | 309 | Ebben |
82080 | 309 | Miután |
47244 | 286 | Bár |
62870 | 271 | Ezzel |
94697 | 262 | Több |
73809 | 257 | Két |
85987 | 239 | Ő |
50202 | 228 | De |
51542 | 223 | E |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
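The same count can be reproduced outside the database. The following is a minimal sketch in Python, assuming the sentences are available as a plain list of strings; the function name `sentence_beginnings` and the sample sentences are hypothetical and not part of the corpus tooling. Splitting on a single space and rejoining the first `n` pieces is meant to mirror `substring_index(sentence, ' ', n)`, so the sketch also covers N = 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Words are separated by single spaces, as in the SQL query; a sentence
    with fewer than n words is counted as a whole.
    """
    counts = Counter(
        " ".join(sentence.split(" ")[:n])  # first n space-separated words
        for sentence in sentences
    )
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT 50
    return counts.most_common(limit)


# Hypothetical sample data, for illustration only
sample = [
    "A ház nagy.",
    "A kutya fut.",
    "Az alma piros.",
]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Calling `sentence_beginnings(sample, n=2)` on the same list gives the two-word beginnings. Note that this sketch, like the query, treats punctuation attached to a word as part of that word.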